Abstract

We use finite-size scaling to study the critical behavior of the quasispecies
model of molecular evolution in the single-sharp-peak replication landscape.
This model exhibits a sharp threshold phenomenon at $Q = Q_c = 1/a$, where
$Q$ is the probability of exact replication of a molecule of length $L$ and $a$ is
the selective advantage of the master string. We investigate the sharpness of
the threshold and find that its characteristics persist across a range of $Q$ of
order $L^{-1}$ about $Q_c$. Furthermore, using the data collapsing method we show
that the normalized mean Hamming distance between the master string and
the entire population, as well as the properly scaled fluctuations around this
mean value, follow universal forms in the critical region.

Although the so-called error threshold phenomenon, which limits the length $L$ of
competing self-reproducing molecules, is acknowledged as one of the main outcomes of Eigen's
quasispecies model [?], the full characterization of the error threshold transition for finite $L$
has not yet been satisfactorily carried out. In fact, as with the definition of the critical
temperature for finite lattices, there is no generally accepted definition of the term error
threshold for finite $L$ [?]. Nevertheless, the study of the systematic deviations from the
infinite-length limit introduced by finite-size effects, besides being practically
independent of the definition adopted, gives valuable information on the behavior of the
relevant macroscopic quantities near the critical region [?].

In the quasispecies model, a molecule is represented by a string of $L$ digits
$s = (s_1, s_2, \ldots, s_L)$, with the variables $s_\alpha$ allowed to take on $k$ different values,
each representing a different type of monomer used to build the molecule. For the sake of
simplicity, in this work we consider only binary strings, i.e., $s_\alpha = 0, 1$. The
concentrations $x_i$ of molecules of type $i = 1, 2, \ldots, 2^L$ evolve in time according to the
following differential equations

$$ \frac{dx_i}{dt} = \sum_j W_{ij} x_j - [D_i + \Phi(t)] x_i , \tag{1} $$

where the constants $D_i$ stand for the death probability of molecules of type $i$, and $\Phi(t)$ is a
dilution flux that keeps the total concentration constant. This flux introduces a nonlinearity
in Eq. (1), and is determined by the condition $\sum_i dx_i/dt = 0$. Specifically, assuming
$D_i = 0$ for all $i$ and $\sum_i x_i = 1$ yields

$$ \Phi = \sum_{i,j} W_{ij} x_j . \tag{2} $$
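The step from the constraint on the total concentration to an explicit flux can be written out. The short derivation below uses only the kinetic equation for $x_i$ and the stated assumptions $D_i = 0$ and $\sum_i x_i = 1$; it is a sketch of the intermediate algebra, not text taken from the original derivation.

```latex
\begin{align}
0 = \sum_i \frac{dx_i}{dt}
  &= \sum_{i,j} W_{ij} x_j - \Phi(t) \sum_i x_i
   = \sum_{i,j} W_{ij} x_j - \Phi(t) , \\
\Rightarrow \quad \Phi(t) &= \sum_{i,j} W_{ij} x_j , \\
\frac{dx_i}{dt} &= \sum_j W_{ij} x_j - x_i \sum_{k,j} W_{kj} x_j .
\end{align}
```

The last line makes the nonlinearity explicit: the flux term is quadratic in the concentrations.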



The elements of the replication matrix $W$ are given by

$$ W_{ii} = A_i q^L \tag{3} $$

and

$$ W_{ij} = A_j q^{L - d(i,j)} (1 - q)^{d(i,j)} , \qquad i \neq j , \tag{4} $$
where $A_i$ is the replication rate or fitness of molecules of type $i$, and $d(i,j)$ is the Hamming
distance between strings $i$ and $j$. Here $0 < q < 1$ is the single-digit replication accuracy,
which is assumed to be the same for all digits.
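
To make the structure of the replication matrix concrete, the following minimal sketch builds $W$ explicitly for a very short binary string. It assumes the standard form $W_{ij} = A_j q^{L-d(i,j)} (1-q)^{d(i,j)}$, with the diagonal as the $d = 0$ case; the integer encoding of strings and the function names are illustrative choices, not part of the original work.

```python
def hamming(i, j):
    """Hamming distance between two binary strings encoded as integers."""
    return bin(i ^ j).count("1")


def replication_matrix(L, q, fitness):
    """Return W with W[i][j] = A_j * q**(L - d) * (1 - q)**d, d = d(i, j).

    The diagonal element is the d = 0 case, W[i][i] = A_i * q**L.
    """
    n = 2 ** L
    W = []
    for i in range(n):
        row = []
        for j in range(n):
            d = hamming(i, j)
            row.append(fitness[j] * q ** (L - d) * (1 - q) ** d)
        W.append(row)
    return W


# Illustrative parameters (assumed, chosen only to keep the matrix small).
L, q, a = 3, 0.9, 10.0
A = [a] + [1.0] * (2 ** L - 1)  # index 0 plays the role of the master string
W = replication_matrix(L, q, A)

# Each column j sums to A_j, because the mutation probabilities out of
# string j sum to one.
col_sums = [sum(W[i][j] for i in range(2 ** L)) for j in range(2 ** L)]
print(col_sums)
```

Since every column sums to the corresponding replication rate, the dilution flux reduces to the mean fitness of the population, $\sum_j A_j x_j$. The explicit $2^L \times 2^L$ matrix is only practical for small $L$.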

In this work we consider the simplest and probably most studied replication landscape,
namely the single-sharp-peak replication landscape, in which we ascribe the replication
rate $a > 1$ to the so-called master string $(0, 0, \ldots, 0)$, and the replication rate 1 to
the remaining strings. In this context, the parameter $a$ is termed the selective advantage of the
master string. As the replication accuracy $q$ decreases, two distinct regimes are observed in
the population composition: the quasispecies regime, characterized by the master string and
its close neighbors, and the uniform regime, where the $2^L$ strings appear in the same
proportion. The transition between these regimes occurs at the error threshold $q_c$. To study this
transition for large $L$ it is more convenient to introduce the probability of exact replication
of an entire string, namely

$$ Q = q^L , \tag{5} $$

so that for $L \to \infty$ the transition occurs at [?]

$$ Q_c = \frac{1}{a} . \tag{6} $$
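
A small numerical illustration shows why $Q$ is the more convenient variable. Assuming the asymptotic threshold $Q_c = 1/a$ is applied at finite $L$ (an approximation), the corresponding single-digit accuracy is $q_c = a^{-1/L}$, which drifts towards 1 as $L$ grows while $Q_c$ stays fixed. The function name is hypothetical.

```python
def per_digit_threshold(a, L):
    """Single-digit accuracy q_c satisfying q_c**L == 1/a.

    Uses the L -> infinity threshold Q_c = 1/a, so for finite L this is
    only an approximate location of the transition.
    """
    return a ** (-1.0 / L)


a = 10.0  # illustrative selective advantage
for L in (10, 50, 150):
    qc = per_digit_threshold(a, L)
    # Whole-string accuracy recovered from q_c is 1/a for every L.
    print(L, qc, qc ** L)
```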

Although there is a consensus that a thermodynamic order-disorder phase transition
occurs only in the limit $L \to \infty$, there is some disagreement on the order of the
transition. On the one hand, the mapping of the steady-state solution of the chemical kinetic
equations (1) into the surface properties of a semi-infinite two-dimensional lattice system
in thermodynamic equilibrium indicates that the relevant order parameter, namely, the
mean normalized Hamming distance $d$ between the master string and the entire population,
vanishes continuously at $Q_c$ [?]. However, owing to the enormous difficulty of solving the
self-consistent equations that describe the equilibrium surface properties, that analysis was
restricted to $L = 20$ [?]. On the other hand, a thorough investigation of an alternative
mapping of equations (1) into a problem of directed polymers in a random medium indicates
that the concentration of master strings presents a discontinuity at $Q = Q_c$ [?]. Since this
mapping allows for the exact solution of the quasispecies model in the single-sharp-peak
replication landscape for generic lengths $L$, that result implies that the transition for
$L \to \infty$ is definitely of first order [?].

The aim of this work is to investigate the finite-size effects near the error threshold
transition. Of particular interest is the determination of the sharpness of the threshold,
namely, the range of $Q$ about $Q_c$ where the threshold characteristics persist. As we expect
that the size of this region shrinks to zero like $L^{-1/\nu}$ as $L \to \infty$, our goal is to estimate the
value of the exponent $\nu > 0$ using finite-size scaling or, more precisely, the data collapsing
method [?]. Our approach is in the same spirit as the finite-size scaling of combinatoric
problems [?], for which there is no geometric criterion for defining a quantity analogous
to the correlation length $\xi$, and so the success of the method in accounting for the size
dependence of the order parameters cannot be attributed to the divergence of $\xi$ and the
consequent onset of a second-order phase transition. In fact, instead of attempting to
map the chemical kinetic equations (1) into an equilibrium statistical mechanics problem, we
resort to a simpler and more direct approach, namely, the exact numerical solution of those
equations in the steady-state regime for molecule lengths up to $L = 150$.

As pointed out by Swetina and Schuster [?], for the single-sharp-peak replication landscape
the $2^L$ molecular concentrations $x_i$ can be grouped into $L + 1$ distinct classes according
to their Hamming distances to the master string. This procedure allows the description of
the chemical kinetics by the following $L + 1$ coupled first-order differential equations [?]

$$ \frac{dY_P}{dt} = \sum_{R=0}^{L} M_{PR} Y_R + (a - 1) Y_0 M_{P0} - Y_P [1 + Y_0 (a - 1)] , \tag{7} $$

where $Y_P$ denotes the concentration of molecules in class $P = 0, \ldots, L$. Clearly, $\sum_P Y_P = 1$.
Here $M_{PR}$ stands for the probability of mutation from a molecule of class $R$ to a molecule
of class $P$ and is given by

$$ M_{PR} = \sum_{I=I_l}^{I_u} \binom{R}{I} \binom{L-R}{P-I} q^{L-P-R+2I} (1-q)^{P+R-2I} , \tag{8} $$

where $I_l = \max(0, P + R - L)$ and $I_u = \min(P, R)$.
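
The class-to-class mutation probabilities can be tabulated directly. The sketch below assumes the combinatorial form in which $I$ counts the mutant digits of the parent that are copied correctly, so that $R - I$ mutant digits revert and $P - I$ master digits mutate; the binomial prefactors follow from that counting and should be read as an assumption of this example. The function name is illustrative.

```python
from math import comb


def mutation_matrix(L, q):
    """Return M with M[P][R] = probability that a class-R parent yields a class-P copy.

    Class index = Hamming distance to the master string. I runs over the
    number of mutant digits of the parent that are copied without error.
    """
    M = [[0.0] * (L + 1) for _ in range(L + 1)]
    for P in range(L + 1):
        for R in range(L + 1):
            lo, hi = max(0, P + R - L), min(P, R)
            M[P][R] = sum(
                comb(R, I) * comb(L - R, P - I)
                * q ** (L - P - R + 2 * I) * (1 - q) ** (P + R - 2 * I)
                for I in range(lo, hi + 1)
            )
    return M


# Illustrative parameters (assumed).
L, q = 5, 0.9
M = mutation_matrix(L, q)

# Sanity check: a parent in any class must produce a copy in some class,
# so each column of M sums to one.
print([sum(M[P][R] for P in range(L + 1)) for R in range(L + 1)])
```

The element $M_{00} = q^L$ is the probability $Q$ of exact replication of the master string.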

The procedure to obtain the steady-state solution $dY_P/dt = 0$ of Eqs. (7) is straightforward.
The steady-state concentrations $Y_P$ for $P = 0, \ldots, L$ can be easily found by
iterating the following set of equations

$$ Y_P = \frac{\sum_{R=0}^{L} M_{PR} Y_R + (a - 1) Y_0 M_{P0}}{1 + Y_0 (a - 1)} . \tag{9} $$

Interestingly, the iteration of these equations is identical to the dynamics of a recently
proposed population genetics model based on the neglect of the linkage disequilibrium at
the population level [?].
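
The iteration can be sketched in a few lines. This is a minimal illustration rather than the procedure actually used for the results reported here: the starting condition, the convergence tolerance, and the function names are assumptions, and the mutation probabilities use the combinatorial form in which $I$ counts correctly copied mutant digits.

```python
from math import comb


def mutation_matrix(L, q):
    """M[P][R]: probability that a class-R parent yields a class-P copy."""
    return [[sum(comb(R, I) * comb(L - R, P - I)
                 * q ** (L - P - R + 2 * I) * (1 - q) ** (P + R - 2 * I)
                 for I in range(max(0, P + R - L), min(P, R) + 1))
             for R in range(L + 1)]
            for P in range(L + 1)]


def steady_state(L, q, a, tol=1e-12, max_iter=100000):
    """Iterate Y_P <- [sum_R M_PR Y_R + (a-1) Y_0 M_P0] / [1 + Y_0 (a-1)]."""
    M = mutation_matrix(L, q)
    Y = [1.0] + [0.0] * L  # assumed start: population of master strings only
    for _ in range(max_iter):
        norm = 1.0 + Y[0] * (a - 1.0)
        new = [(sum(M[P][R] * Y[R] for R in range(L + 1))
                + (a - 1.0) * Y[0] * M[P][0]) / norm
               for P in range(L + 1)]
        if max(abs(x - y) for x, y in zip(new, Y)) < tol:
            return new
        Y = new
    return Y  # not converged within max_iter; caller should check


# Illustrative run (assumed parameters): Q = q**L = 0.5, above Q_c = 1/a = 0.1.
L, a = 20, 10.0
Y = steady_state(L, 0.5 ** (1.0 / L), a)
print(sum(Y), Y[0])
```

Each sweep preserves the normalization $\sum_P Y_P = 1$, so no explicit renormalization is needed. Convergence is expected to slow down close to the threshold, which is why an iteration cap is included.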



The relevant quantities to describe the structure of the population are the normalized
mean Hamming distance between the master string and the whole population, defined by

$$ d = \frac{1}{L} \sum_{P=0}^{L} P \, Y_P , \tag{10} $$

and the average of the squared deviations around $d$,

$$ \sigma^2 = L^2 \sum_{P=0}^{L} \left( \frac{P}{L} - d \right)^2 Y_P . \tag{11} $$
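
Both quantities follow directly from the class concentrations. The sketch below assumes the fluctuation is defined with an $L^2$ prefactor, i.e. as the variance of the unnormalized Hamming distance, which is how the garbled source equation is read here; the function name is illustrative.

```python
from math import comb


def hamming_moments(Y):
    """Return (d, sigma2) for class concentrations Y[0..L].

    d      : mean Hamming distance to the master string, divided by L
    sigma2 : L**2 * sum_P (P/L - d)**2 * Y_P (assumed prefactor)
    """
    L = len(Y) - 1
    d = sum(P * y for P, y in enumerate(Y)) / L
    sigma2 = L ** 2 * sum((P / L - d) ** 2 * y for P, y in enumerate(Y))
    return d, sigma2


# Uniform regime: all 2**L strings equally likely, so classes are binomial.
L = 8
uniform = [comb(L, P) / 2 ** L for P in range(L + 1)]
print(hamming_moments(uniform))

# Population made only of master strings.
print(hamming_moments([1.0] + [0.0] * L))
```

In the uniform regime the class distribution is binomial, so $d = 1/2$ and the fluctuation equals $L/4$; a population of master strings alone has $d = 0$ and no fluctuation.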

Clearly, $d$ and $\sigma^2$ are analogous to the magnetization and susceptibility in a system of
Ising spins. In Figs. 1 and 2 we present $d$ and $\sigma^2$, respectively, as functions of the properly
normalized probability of exact replication $Q/Q_c$. As expected, the results of Fig. 1 show
the sharpening of the transition with increasing $L$. Furthermore, all curves intersect at a
unique point, namely, the critical point $Q = Q_c$. This somewhat unexpected result has
proved very useful to locate the threshold when its location is not known a priori
[?]. The curves shown in Fig. 2 indicate that the height of the peak of $\sigma^2$, denoted by
$\sigma^2_{\max}$, increases with increasing $L$ like $L^{\gamma/\nu}$. As illustrated in the inset, the ratio
$\gamma/\nu$ is given by the slope of the straight line fitting the data points in a plot of
$\ln \sigma^2_{\max}$ versus $\ln L$. The result $\gamma/\nu = 1.96$ is in good agreement with the analytical
prediction that the rms amplitude of a quasispecies around the master string ($\sqrt{\sigma^2}$)
diverges algebraically with exponent 1 as $Q \to Q_c$ [?].
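
The slope extraction described above amounts to an ordinary least-squares fit in log-log coordinates. The following sketch shows the fit only; the peak heights fed to it are synthetic placeholders that obey an exact power law and are not the measured values behind the reported exponent. The function name is hypothetical.

```python
from math import log


def loglog_slope(sizes, peaks):
    """Least-squares slope of ln(peaks) versus ln(sizes)."""
    xs = [log(L) for L in sizes]
    ys = [log(s) for s in peaks]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic placeholder data: an exact power law with exponent 2.
# These are NOT measured peak heights.
sizes = [70, 100, 120, 150]
synthetic_peaks = [0.3 * L ** 2 for L in sizes]
slope = loglog_slope(sizes, synthetic_peaks)
print(slope)
```

With real data the input would be the maximum of the fluctuation over $Q$ for each $L$, and the fitted slope is the estimate of $\gamma/\nu$.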



The exponent $1/\nu$ is estimated using the standard data collapsing method [?], as illustrated
in Figs. 3 and 4, where we plot $d$ and $L^{-\gamma/\nu} \sigma^2$, respectively, versus $L^{1/\nu} \epsilon$. Here
$\epsilon = (Q - Q_c)/Q_c$ is the reduced probability of exact replication. The collapse of the curves
for different $L$ was achieved with the exponents $1/\nu = 1$ and $\gamma/\nu = 1.958$ regardless of the
value of the selective advantage parameter $a$, indicating the universal character of these
exponents. However, as shown in these figures, the universal forms (i.e., scaling functions)
followed by the properly scaled order parameters in the critical region depend on $a$. Since
$\nu = 1$, we note that the characteristics of the threshold transition persist across a range of
$Q$ of order $L^{-1}$ about $Q_c = 1/a$.
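
The rescaling behind a data-collapse plot is simple to state in code. The sketch below maps one data point to scaled coordinates using the exponents quoted above as defaults; the function names and the sample numbers in the usage lines are illustrative assumptions, not results.

```python
def reduced_replication(Q, a):
    """Reduced probability of exact replication, eps = (Q - Q_c) / Q_c, with Q_c = 1/a."""
    Qc = 1.0 / a
    return (Q - Qc) / Qc


def collapse_point(Q, sigma2, L, a, inv_nu=1.0, gamma_over_nu=1.958):
    """Map (Q, sigma2) measured at length L to scaled coordinates.

    x = L**(1/nu) * eps           (horizontal axis of the collapse plot)
    y = L**(-gamma/nu) * sigma2   (scaled fluctuation)
    """
    x = L ** inv_nu * reduced_replication(Q, a)
    y = L ** (-gamma_over_nu) * sigma2
    return x, y


# Illustrative numbers only: a 1% shift above Q_c at L = 100 maps to x close to 1.
x, y = collapse_point(Q=0.101, sigma2=50.0, L=100, a=10.0, gamma_over_nu=2.0)
print(x, y)
```

If the exponents are right, points computed for different $L$ at the same $a$ fall on a single curve in the $(x, y)$ plane; $d$ needs no vertical rescaling and is plotted directly against $x$.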

As in the case of combinatoric problems [?], it is surprising that finite-size scaling is
so effective in characterizing the error threshold transition of the quasispecies model in the
single-sharp-peak replication landscape, which is known to be of first order [?]. The collapse
of the data for different $L$ into a single, universal curve presented in Figs. 3 and 4, however,
is incontestable evidence of the usefulness of the finite-size scaling method for investigating
threshold phenomena. In fact, the existence of the universal forms presented in those figures,
together with the characterization of the sharpness of the error threshold, are the main results
of this paper.

FIGURES

FIG. 1. Normalized mean Hamming distance $d$ between the master string and the whole
population as a function of the normalized probability of exact replication $Q/Q_c$ for $a = 10$,
and $L = 70$ (□), 100 (○), 120 (△) and 150 (×).

FIG. 2. Standard deviation $\sigma^2$ as a function of the normalized probability of exact replication
$Q/Q_c$. The inset illustrates the procedure used to estimate the ratio $\gamma/\nu$. The parameters and
convention are the same as for Fig. 1.

FIG. 3. Normalized mean Hamming distance as a function of the scaled reduced probability of
exact replication. The parameters are $1/\nu = 1$ and (from bottom to top) $a = 10$, 20 and 50. The
convention is the same as for Fig. 1.

FIG. 4. Scaled standard deviation as a function of the scaled reduced probability of exact
replication. The parameters are $1/\nu = 1$, $\gamma/\nu = 1.958$, and (from top to bottom at the peak
location) $a = 10$, 20 and 50. The convention is the same as for Fig. 1.